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From  observations  and  conservation  laws, 

ocean  scientists  formulate  models  that  aim 
to  explain  and  predict  dynamics  of  the  sea. 


A  multitude  of  physical  and  biological  processes 
occur  in  the  ocean  over  a  wide  range  of  temporal 
and  spatial  scales.  Many  of  these  processes  are  non¬ 
linear  and  highly  variable,  and  involve  interactions 
across  several  scales  and  oceanic  disciplines.  For 
example,  sound  propagation  is  influenced  by  physi¬ 
cal  and  biological  properties  of  the  water  column 
and  by  the  seabed.  From  observations  and  conser¬ 
vation  laws,  ocean  scientists  formulate  models  that 
aim  to  explain  and  predict  dynamics  of  the  sea. 

This  formulation  is  intricate  because  it  is  challeng¬ 
ing  to  observe  the  ocean  on  a  sustained  basis  and  to 
transform  basic  laws  into  generic  but  usable  mod¬ 
els.  There  are  imperfections  in  both  data  and  model 
estimates.  It  is  important  to  quantify  such  uncer¬ 
tainties  to  understand  limitations  and  identify  the 
research  needed  to  increase  accuracies,  which  will 
lead  to  fundamental  progress. 

There  are  several  sources  of  uncertainties  in  ocean 
modeling.  First,  to  simplify  models  (thereby  reduc¬ 
ing  computational  expenses),  explicit  calculations  are 
only  performed  on  a  restricted  range  of  spatial  and 
temporal  scales  (referred  to  as  the  “scale  window”) 


(Nihoul  and  Djenidi,  1998).  Influences  of  scales  out¬ 
side  this  window  are  neglected,  parameterized,  or 
provided  at  boundaries.  Such  simplifications  and 
scale  reductions  are  a  source  of  error.  Second,  un¬ 
certainties  also  arise  from  the  limited  knowledge  of 
processes  within  the  scale  window,  which  leads  to 
approximate  representations  or  parameterizations. 
Third,  ocean  data  are  required  for  model  initializa¬ 
tion  and  parameter  values;  however,  raw  measure¬ 
ments  are  limited  in  coverage  and  accuracy,  and  they 
are  often  processed  with  the  aim  of  extracting  infor¬ 
mation  within  a  predetermined  scale  window.  Initial 
conditions  and  model  parameters  are  thus  inexact. 
Fourth,  models  of  interactions  between  the  ocean 
and  Earth  system  are  approximate  and  ocean  bound¬ 
ary  conditions  are  inexact.  For  example,  effects  of 
uncertain  atmospheric  fluxes  can  dominate  oceanic 
uncertainty.  Fifth,  miscalculations  occur  due  to  nu¬ 
merical  implementations.  All  of  the  above  leads  to 
differences  between  the  actual  values  (unknown)  and 
the  measured  or  modeled  values  of  physical,  biologi¬ 
cal,  and  geo-acoustical  fields  and  properties. 
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To  reduce  uncertainties,  the  sources  of 
information  (the  various  data  and  dy¬ 
namical  models)  are  combined  by  data 
assimilation  (DA)  (Robinson  et  al.,  1998; 
Robinson  and  Lermusiaux,  2002).  Data 
assimilation  is  challenging  and  expensive 
to  carry  out,  but  optimal  in  the  sense 
that  each  type  of  information  is  weight¬ 


ed  in  accord  with  its  uncertainty.  Of 
course,  should  optimal  estimates  fail  to 
be  accurate,  a  priori  assumptions  about 
uncertainties  are  revised,  and  models 
and  data  sets  improved. 

Any  comprehensive  ocean  prediction 
(e.g.,  Mooers,  1999;  Pinardi  and  Woods, 
2002)  should  include  uncertainty  esti¬ 
mates.  Predicted  uncertainties  consist  of 
the  integration  in  time  of  initial  errors 
and  of  errors  introduced  during  model 
integration.  Uncertainty  is  defined  in 
terms  of  the  probability  density  function 
(PDF)  of  the  error  in  the  estimate.  Error 
refers  to  the  difference  between  the  truth 
and  the  estimate.  Uncertainties  are  often 
represented  by  low- order  characteris¬ 
tics  of  the  error  PDF  (e.g.,  the  moments 
or  confidence  intervals).  Because  ocean 
fields  are  four-dimensional,  straightfor¬ 
ward  uncertainty  representations  are 
also  fields,  with  structures  in  time  and 
space.  Variability  and  uncertainty  are 
related  but  different  (e.g.,  Lermusiaux, 


2002).  For  any  estimate,  the  portion  of 
variability  that  contains  errors  contrib¬ 
utes  to  uncertainty.  The  variability  that 
is  unresolved  is  purely  uncertainty.  For 
example,  the  historical  temperature  vari¬ 
ability  maps  shown  on  Figure  1  are  mo¬ 
ments  of  a  variability  PDF.  The  standard 
deviations  (Figure  lb)  are  uncertainty 


amplitudes  for  the  mean  (Figure  la)  if 
the  historical  data  are  the  sole  informa¬ 
tion  used  to  estimate  this  mean. 

Although  uncertainties  have  been  at 
the  heart  of  ocean  investigations  for  a 
long  time,  realistic  uncertainty  predic¬ 
tions  are  recent.  Early  attempts  in  the 
context  of  DA  are  described  in  Malan- 
otte-Rizzoli  (1996).  The  first  real-time 
uncertainty  predictions  using  an  ad¬ 
vanced  DA  scheme  in  a  full-featured 
nonlinear  model  were  carried  out  for 


the  Strait  of  Sicily  in  1996  (Lermusiaux, 
1999).  The  scheme  utilized  was  Error 
Subspace  Statistical  Estimation  (ESSE, 
Lermusiaux  et  al.,  2002).  Related  Monte- 
Carlo  ensemble  schemes  (Evensen,  1994; 
Miller  et  al.,  1999)  are  now  being  used 
in  other  regions.  Generalized  inverse 
schemes  can  account  for  all  uncertainty 
sources  (an  excellent  example  is  Egbert 
et  al.,  1994),  but  avoid  computing  uncer¬ 
tainty  fields  to  gain  computational  speed. 
In  atmospheric  studies,  ensemble  fore¬ 
casting  has  been  utilized  for  uncertainty 
predictions  for  some  time  (e.g.,  Toth  and 
Kalnay,  1993;  Molteni  et  al.,  1996;  Ehren- 
dorfer,  1997)  and  realistic  ensemble  DA 
has  been  carried  out  recently  (Whitaker 
et  al.,  2004;  Houtekamer  et  al.,  2005;  Szu- 
nyogh  et  al.,  2005).  Climate  uncertainty 
forecasting  has  been  initiated,  often  based 
on  simple  perturbations  of  selected  pa¬ 
rameters  and  initial  conditions  (Murphy 
et  al.,  2004;  Stainforth  et  al.,  2005). 

The  present  study  describes  and  illus¬ 
trates  the  mechanics  and  computations 
involved  in  modeling  and  predicting  un¬ 
certainties  for  ocean  science  and  its  mod¬ 
ern  applications.  It  is  an  outgrowth  of  the 
U.S.  Office  of  Naval  Research’s  (ONR) 
Capturing  Uncertainty  in  the  Tactical 
Environment  Initiative  (ONR,  2001), 
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The  evolution  of  error  covariances  depends  on  four 

factors:  (1)  the  initial  error  condition,  (2)  the  deterministic 
dynamics  that  increase  or  reduce  errors...,  (3)  the 

stochastic  forcings  that  model  errors  in  the  deterministic 
model...,  and  (4)  the  impact  of  data  that  reduces  variance. 
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(a)  Summer  mean  surface  (0-15  m)  temperature 
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(b)  Summer  standard  deviation  of  surface  (0-15  m)  temperature 


Figure  1.  Measurement-based  variability  estimates  for  the  summer  season  in  the  Mid-Atlantic  Bight  (MAB)  continental  shelf  and  slope  region,  within 
0-m  to  15-m  depths,  (a)  Mean  of  the  temperature  data,  in  °C.  (b)  Standard  deviation  of  the  temperature  data,  in  °C.  The  data  used  to  compute  these 
maps  are  historical  raw  temperature  profiles  from  a  variety  of  data  sources  (Linder  and  Gawarkiewicz,  in  press).  The  maps  are  representations  of  the 
variability — the  mean  and  standard  deviation  of  a  variability  PDF. 


which  involved  scientists  from  physical 
oceanography,  ocean  modeling,  marine 
geosciences,  ocean  acoustics,  signal  pro¬ 
cessing,  and  sonar  engineering.  Detailed 
mathematical  and  computational  aspects 
are  given  in  the  references.  ESSE  is  used 
to  exemplify  interdisciplinary  data-assim- 
ilative  uncertainty  estimation  and  predic¬ 
tion,  focusing  on  regional  applications. 

MODELING  AND  PREDICTING 
UNCERTAINTIES  IN  THE  OCEAN 

Uncertainty  estimation  begins  with  the 
identification  of  significant  variability 
that  is  not  represented.  This  is  the  input 
or  prior  information.  Uncertainty  predic¬ 
tions  can  then  be  obtained  from  evolu¬ 
tion  equations  (Jazwinski,  1970)  for  the 
error  PDFs  of  the  model  state  and  pa¬ 
rameters.  When  observations  are  made, 
these  PDFs  are  combined  with  the  new 
data  and  their  PDFs.  However,  it  is  im¬ 
practical  to  solve  such  PDF  equations  for 
discrete  ocean-model  variables  because 
of  the  large  number  0(105-107)  of  grid 
points.  Ocean  uncertainty  estimation 


has  thus  focused  on:  (1)  the  conditional 
mean,  which  is  the  minimum  error  vari¬ 
ance  estimate,  and  (2)  error  variances 
and  covariances,  which  are  simple  but  es¬ 
sential  components  of  the  error  statistics. 
(The  variance  is  the  square  of  the  stan¬ 
dard  deviation,  which  is  a  measure  of  the 
average  deviation  from  the  mean.  Error 
covariances  measure  the  extent  to  which 
errors  in  two  variables  vary  together.) 

The  evolution  of  error  covariances 
depends  on  four  factors:  (1)  the  initial 
error  condition,  (2)  the  deterministic 
dynamics  that  increase  or  reduce  errors 
by  internal  advection,  diffusion,  or  re¬ 
action,  and  by  external  forcing,  (3)  the 
stochastic  forcings  that  model  errors  in 
the  deterministic  model  and  increase  er¬ 
ror  variance,  and  (4)  the  impact  of  data 
that  reduces  variance.  Each  of  these  fac¬ 
tors  is  normally  important.  Care  is  thus 
required  when  approximate  equations 
are  used  to  evolve  error  covariances.  For 
example,  a  passive  tracer  equation  would 
only  capture  part  of  factor  2. 

Ocean  uncertainty  forecasts  can  be 


used  to  qualify  the  prediction,  assimi¬ 
late  data,  or  estimate  predictability  lim¬ 
its.  Today,  most  uncertainty  forecasting 
schemes  are  based  on  ensemble  Monte- 
Carlo  approaches  and  reductions  of  the 
high- dimensional  error  space  to  a  low¬ 
dimensional  subspace  that  contains  the 
essential  uncertainty.  The  schemes  first 
aim  to  account  for  the  largest  uncer¬ 
tainties  in  each  source  of  information: 
dynamical  model,  measurement  model/ 
data,  initial  and  boundary  conditions, 
and  parameters.  With  these  uncertainty 
inputs,  they  then  predict  the  largest  un¬ 
certainties  (the  error  subspace)  of  the 
dynamical  state  and  reduce  them  by  DA. 
Mathematically,  it  is  the  DA  criterion 
that  sets  the  choice  of  the  subspace.  The 
suboptimal  truncation  of  errors  in  the 
full  space  is  then  optimal.  For  a  mini¬ 
mum  error  variance,  the  subspace  is  de¬ 
fined  by  dominant  modes  of  the  error 
covariance  matrix.  Computational  com¬ 
ponents  involved  in  such  modeling  and 
prediction  of  uncertainties  are  outlined 
next  and  illustrated  with  ESSE. 
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Deterministic  Models  and  Their 
Approximations 

Most  physical  ocean  models  are  derived 
from  the  classic  Navier- Stokes  equations 
for  fluid  dynamics  in  a  rotating  frame  of 
reference.  These  equations  are  determin¬ 
istic:  they  always  give  the  same  output 
for  a  given  input.  Practical  assumptions 
are  used  to  limit  the  range  of  modeled 
scales.  A  common  reduction,  the  Primi¬ 
tive-Equations  (PE)  model  (Pedlosky, 
1987),  is  used  here  within  the  Harvard 
Ocean  Prediction  System  (HOPS,  2004). 
Acoustic  models  are  also  derived  from 
Navier-Stokes  and  are  usually  based  on 
a  wave  equation  for  the  sound  pressure 
(Kuperman,  2004).  Efficient  acoustic 
model  approximations  include  modal 
decompositions  (Chiu  et  al.,  1996)  and 
linearizations.  Even  though  much  prog¬ 
ress  has  been  made  in  marine  ecosystem 
modeling  (e.g.,  Hofmann  and  Friedrichs, 
2002),  deterministic  biological  equations 
as  fundamental  as  Navier-Stokes  are  not 
yet  available.  For  lower  trophic  levels, 
most  models  are  based  on  advection-re- 
action-diffusion  equations.  They  differ  in 
their  structure,  the  number  of  state  vari¬ 
ables  employed,  and  the  parameteriza- 
tions  used.  Details  on  the  models  used  in 
this  manuscript  are  in  Lermusiaux  et  al. 
(2002)  and  Lermusiaux  and  Chiu  (2002). 

Deterministic  physical,  biological,  or 
acoustical  models  commonly  compute 
future  conditions  based  on  given  initial 
conditions.  They  also  play  an  important 
role  in  the  prediction  of  uncertainties. 
They  allow  explaining  the  deterministic 
evolution  of  the  initial  errors.  However, 
approximations  to  fundamental  equa¬ 
tions  lead  to  errors  in  these  models, 
which  need  to  be  taken  into  account. 


Stochastic  Forcing  and  Models 
of  Uncertainties  in  Deterministic 
Dynamical  Models 

To  represent  the  dominant  components 
of  processes  neglected  or  not  well  repre¬ 
sented  in  deterministic  models,  stochas¬ 
tic  error  models  are  starting  to  be  used. 
For  example,  Figure  2  illustrates  statisti¬ 
cal  effects  of  sub-mesoscale  processes 
not  resolved  in  a  mesoscale-resolution 
PE  model.  The  model  errors  are  mod¬ 
eled  using  unbiased  random  noise  with 
an  exponential  decorrelation  in  time.  In 
space,  the  amplitude  of  the  noise  var¬ 
ies  only  in  the  vertical  direction  and  has 
about  a  two-grid  point  correlation  in  ev¬ 
ery  spatial  direction.  For  each  prognostic 
equation,  the  noise  variance  at  a  given 
depth  is  set  to  be  a  small  fraction  (25 
percent)  of  the  amplitude  of  the  terms 
involved  in  the  dominant  dynamical  bal¬ 
ance  at  that  depth. 

What  is  modeled  with  a  determinis¬ 
tic  or  stochastic  equation  depends  on 
knowledge  and  on  the  scale  window  of 
interest.  Generally,  processes  that  are  well 
known  in  this  window  are  modeled  de¬ 
terministically.  All  other  processes,  inside 
or  outside  of  the  scale  window,  should  be 
modeled  with  a  stochastic  component. 
These  stochastic  components  can  be  ei¬ 
ther  additive  (added  as  a  new  term  to 
the  deterministic  model)  or  multiplica¬ 
tive  (e.g.,  inside  an  original  term  of  the 
model).  Additive  forcing,  uncorrelated 
with  the  deterministic  variables,  is  useful 
in  ocean  models,  but  it  should  be  auto- 
correlated  in  time  and  space  because  the 
statistics  of  many  natural  processes  can 
be  approximated  this  way  (Gardiner, 
1983;  Lermusiaux  et  al.,  2002). 


Boundary  Condition  Uncertainties 

Open  boundary  conditions  in  regional 
modeling  are  a  large  source  of  uncertain¬ 
ties,  in  part  because  their  estimation  is 
not  always  well  posed  (Bennett,  1992, 
and  references  therein).  Moreover,  ex¬ 
changes  between  the  ocean  and  atmo¬ 
sphere  are  often  computed  based  on  at¬ 
mospheric  forcing  fluxes  obtained  from 
an  independent  atmospheric  model. 
Inaccuracies  also  arise  in  surface  and 
coastal  boundary  conditions  such  as 
parameterizations  of  boundary  layers, 
fluxes  exchanged  at  coastlines,  or  river 
discharge  inputs.  Most  boundary  con¬ 
dition  uncertainties  are  modeled  with 
simple  stochastic  forcing  and  can  be 
underestimated  to  limit  numerical  insta¬ 
bilities.  For  example,  ESSE  currently  uses 
white  noise  models  or  time-correlated 
noise  models  at  boundaries.  At  the  ocean 
surface,  more  advanced  atmospheric  flux 
uncertainty  models  are  definitely  needed, 
for  example,  to  account  for  flow-depen- 
dent  uncertainties. 

Parametric  Uncertainties 

To  motivate  the  need  for  representing 
parametric  uncertainties,  consider  the 
fit  of  the  mixing-layer  depth  in  a  param¬ 
eterization  of  the  transfer  of  wind  stress 
to  the  ocean’s  surface  boundary  layer 
(Lermusiaux,  2001).  Figure  3  illustrates 
such  a  fit  of  the  mixing-layer  depth  fac¬ 
tor  to  Seasoar  data  collected  during  the 
shelfbreak  PRIMER  experiment  and  to 
atmospheric  fluxes  obtained  from  ad¬ 
justed  model  fields  (Baumgartner  and 
Anderson,  1999).  The  fitted  factor  varies 
in  time  (solid  curve  on  top  of  Figure  3). 
Its  uncertainty  is  represented  by  the  his¬ 
togram  around  the  mean  fit  (Figure  3, 
bottom).  As  the  top  panel  shows,  in  the 
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Figure  2.  Illustration  of  the  effects  of  adding  random  forcings,  correlated  in  time  and  space,  to  a  PE  model.  Shown  are  differences  be¬ 
tween  a  deterministic  and  stochastic  PE  model  simulation  over  the  Monterey  Bay  and  California  Current  System  region,  after  one  day 
of  integration.  Differences  in  horizontal  maps  of  T  and  | \u.\ |  are  shown  at  30-m  depth  (top  and  bottom  left)  and  in  cross  sections  (from 
offshore  to  the  coast  in  Monterey  Bay)  of  T  and  u,  from  0-200  m  depth.  The  amplitudes  of  the  random  forcings  were  set  to  a  fraction  (25 
percent)  of  the  average  geostrophic  balance  at  each  depth  (||geostrophy(z)||),  with  a  half-day  decorrelation  in  time  and  one-to-two  grid 
point  correlation  in  space.  Ceostrophy  is  not  always  the  sole  component  of  the  dominant  PE  balance  in  the  region.  Future  random  forc¬ 
ing  models  will  for  example  include  impacts  of  atmospheric  forcing  in  the  balance. 


ideal  case,  the  value  of  the  Ekman  fac¬ 
tor  should  be  adapted  in  real  time  to  the 
wind  and  mixed-layer  depth  data. 

Most  ocean  model  parameters  are 
kept  constant  in  time  and  space.  Ide¬ 


ally,  uncertainties  of  influential  param¬ 
eters  should  be  modeled  in  a  prediction 
or  error  budget.  Parameter  values  can 
be  estimated  directly  by  DA.  Priors  are 
then  assigned  for  each  parameter  and 


posteriors  are  the  result  of  the  DA.  In 
biological  estimation,  such  quantitative 
parameter  estimation  can  be  necessary 
to  achieve  meaningful  results  (e.g.,  Spitz 
et  al.,  2001). 
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Fitted  Ekman  Factor:  0.0586 


Figure  3.  Uncertainties  in  multiple  model 
parameters.  Example  of  the  mixing-layer 
depth  factor  over  the  MAB  shelfbreak  front 
region.  This  factor  is  the  proportionality  "Ek¬ 
man  factor  (E  )”  linking  the  Ekman  depth 
to  the  turbulent  friction  velocity  and  the 
Coriolis  frequency.  Top:  Fitted  mixed-layer 
depth  (dark  continuous  curve)  and  estimat¬ 
ed  mixed-layer  depths  (crosses)  computed 
for  every  Seasor  ocean  data  profile  and  the 
corresponding  local  atmospheric  model 
of  wind-stress  (Baugmarter  and  Anderson, 
1996)  at  the  profile  location  and  time.  Bot¬ 
tom:  Distribution  of  the  misfit  between  the: 
(1)  mixed-layer  depths  estimated  from  the 
wind-stress  and  the  ocean  data;  and  (2),  the 
fitted  mixed-layer  depth  (dark  curve  on  top). 
The  fitted  Ekman  factor  is  0.0586. 


Measurement  Models 

The  platforms  and  sensors  used  today 
in  the  ocean  provide  a  wide  range  of 
observations  on  physical,  biological, 
acoustical,  and  geological  features,  from 
microstructures  to  climate  (Dickey,  2003; 
Griffiths  et  al.,  2002).  Melding  these  data 
with  dynamical  models  requires  mea¬ 
surement  models  that  link  the  dynamical 
model  variables  and  parameters  to  the 
observations.  These  models  include  un¬ 
certainties  because  sensors,  data  process¬ 
ing,  and  relationships  among  data  and 
dynamical  variables  are  approximate. 
Even  though  uncertainties  in  measure¬ 
ment  models  can  be  complex,  simple 
data  noise  models  are  often  used.  For 
example,  the  ESSE  system  uses  measure¬ 
ment  errors  that  are  correlated  only  in 
the  vertical  and  that  have  amplitudes  as  a 
function  of  depth  only. 


Uncertainty  Initialization 

Dynamics  and  historical  data  are  used 
for  uncertainty  initialization.  In  ESSE, 
the  dominant  error  modes  are  initialized 
in  two  steps:  what  is  observed  first,  then 
what  is  not  observed.  The  “observed 
portions”  are  those  that  can  be  esti¬ 
mated  from  differences  between  a  back¬ 
ground  state  and  historical  data.  Synop¬ 
tic  data  can  also  be  used  to  specify  un¬ 
certainties  corresponding  to  unresolved 
scales  in  the  background.  The  “non-ob¬ 
served”  portions  are  then  computed  by 
dynamical  inference.  First,  the  observed 
portions  are  used  to  perturb  the  back¬ 
ground.  An  ensemble  of  model  integra¬ 
tions  is  then  carried  out  to  adjust  the 
non-observed  portions  to  the  observed 
ones.  The  result  is  an  ensemble  of  com¬ 
plete  uncertainty  samples  from  which 
one  can  estimate  dominant  modes  of 


the  initial  error  covariance.  This  ap¬ 
proach  can  be  generalized  to  multiple 
scales  (Lermusiaux,  2002). 

To  obtain  an  ensemble  of  states,  the 
background  state  is  perturbed  by  a  com¬ 
bination  of  the  initial  error  modes,  with 
dynamical  constraints.  A  white  noise 
model  is  also  used  to  model  uncertain¬ 
ties  truncated  by  projection  in  the  sub¬ 
space.  A  detailed  exposition  of  other 
ensemble  generation  methods  can  be 
found  in  Miller  and  Ehret  (2002). 

Data  Assimilation  and  Uncertainty 
Reduction 

In  DA,  the  data,  measurement  models, 
and  dynamical  models  are  combined  in 
accord  with  their  prior  uncertainty  esti¬ 
mates  using  a  criterion  that  determines 
the  weight  of  each  source  of  information. 
DA  can  provide  melded  estimates  of  the 
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state,  parameters,  and  even  of  the  model 
of  the  ocean.  Various  methods  for  DA  in 
meteorology  and  oceanography  can  be 
found  in  Bennett  (1992),  Wunsch  (1996), 
Robinson  et  al.  (1998),  and  Kalnay 
(2003).  Schemes  are  derived  from  estima¬ 
tion  theory,  control  theory  (variational 
approaches),  or  optimization  theory.  Es¬ 
timation  theory  schemes  solve  a  forward/ 
filtering  problem  or  a  smoothing/inverse 
problem.  Control  theory  schemes  solve  a 
smoothing  problem.  Almost  all  schemes 
are  linked  to  a  minimization  of  an  error 
norm,  the  DA  criterion.  Optimization 
theory  schemes  directly  minimize  such 
a  criterion  or  cost  function.  For  real  DA, 
most  methods  are  based  on  least-squares 
norms  and  focus  on  the  conditional 
mean  and  error  covariance  matrix. 

Posterior  uncertainties  and  data-mod- 
el  misfits  can  be  used  for  two  essential 
DA  feedbacks.  First,  adaptive  sampling 
estimates  the  types  and  locations  of 
the  observations  that  are  most  needed 
(Bishop  et  al.,  2001;  Lermusiaux,  1999). 
Second,  adaptive  modeling  identifies 
the  model  properties  that  need  the  most 
improvements  (Lermusiaux  et  al.,  2004). 
These  feedbacks  lead  to  improved  under¬ 
standing  and  can  be  most  powerful  when 
uncertainty  estimates  are  available. 

METHODOLOGIES  AND 

INTERDISCIPLINARY 

APPLICATIONS 

Error  Subspace  Statistical 
Estimation 

There  are  seven  components  in  ESSE: 
error  subspace  initialization,  state  and 
uncertainty  prediction,  minimum  er¬ 
ror  variance,  adaptive  error  correction, 
smoothing,  and  adaptive  sampling.  The 


ESSE  uncertainty  initialization  is  based 
on  data,  model,  and  multi-scale  decom¬ 
positions.  The  prediction  is  obtained 
from  a  central  forecast  and  an  ensemble 
of  nonlinear  model  integrations  that 
include  random  forcing.  Currently,  verti¬ 
cally  correlated  noise  models  represent 
uncertainties  in  measurements  and  white 
noise  models  in  boundary  conditions 
and  parameters.  Computations  are  dis¬ 
tributed  on  a  set  of  computers.  Quan¬ 
titative  criteria  control  the  ensemble 
size.  When  they  are  satisfied,  data  and 
models  are  combined  by  minimum  er¬ 
ror  variance  in  the  error  subspace.  Data 
residuals  are  then  used  for  adaptive  error 
corrections.  To  update  past  estimates, 
smoothing  via  ESSE  is  run  backward  in 
time.  Adaptive  sampling  plans  are  pre¬ 
dicted  (Lermusiaux,  1999,  2001)  using 
uncertainty  estimates. 

ESSE  has  been  developed  for,  and 
applied  to,  fundamental  research  and 
real-time  operations.  The  ESSE  filtering/ 
smoothing  schemes  permit  physical,  bio¬ 


logical,  and  acoustical  DA  with  four-di¬ 
mensional  interdisciplinary  covariances. 
Physical  data  then  influence  the  biology 
and  acoustics,  and  vice  versa.  Computa¬ 
tional  complexities  in  ESSE  arise  from 
the  diversity  of  ocean  geometries,  data 
properties,  deterministic  and  stochastic 


parameters,  and  distributed  workflows. 
However,  many  computations  involve 
linear  algebra,  which  allows  the  use  of 
efficient  community  packages. 

Estimation  of  Uncertainties  in 
Secondary  Variables:  Coherent 
Structures 

Nowcasts  and  forecasts  are  commonly 
used  to  infer  secondary  or  diagnos¬ 
tic  quantities,  such  as  energy,  vorticity, 
bioluminescence,  acoustic  travel  time, 
drifter  paths,  or  other  Lagrangian  indica¬ 
tors.  For  example,  engineers  and  scien¬ 
tists  interested  in  the  trajectories  of  par¬ 
ticles  or  vehicles  in  the  ocean  compute 
Direct  Lyapunov  Exponent  (DLE)  and 
Lagrangian  Coherent  Structures  (LCSs) 
fields  from  velocity  fields  (Lekien  et  al., 
2005,  and  references  therein).  The  LCSs 
are  here  defined  as  ridges  in  the  DLE 
held.  They  represent  mobile  separatri- 
ces,  which  divide  the  flow  into  regions 
of  distinct  motions  and  can  indicate 
non-obvious  boundaries  in  complex, 


time-varying  hows.  Figure  4a  shows  the 
DLE  held  and  their  ridges  during  an  up- 
welling  event  in  Monterey  Bay.  The  LCSs’ 
ridges  clearly  separate  regions  of  differ¬ 
ent  properties.  For  example,  consider  the 
strong  LCS  between  the  cyclonic  circula¬ 
tion  in  the  Monterey  Bay  and  the  Cali- 


The  representation,  attribution  and 

propagation  of  four-dimensional  oceanic 
uncertainties  presents  many  interesting 

challenges  and  requires  increased  theoretical 
and  applied  research  efforts. 
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fornia  Current  System,  or  the  circular 
LCS  enclosing  fluid  trapped  in  an  eddy. 
The  utility  of  such  estimates  depends  on 
their  robustness  to  uncertainties.  Figure 
4b  shows  uncertainties  transferred  from 
the  ocean  state  to  the  DLE  state.  Major 
LCSs  are,  in  this  case,  regions  of  small 
relative  uncertainties,  making  them  ro¬ 
bust  descriptors  of  the  flow. 

Uncertainty  Predictions  for 
Acoustical  and  Physical  Fields  in  a 
Shelfbreak  Front 

The  main  hydrographic  feature  near  the 
Mid-Atlantic  Bight  shelfbreak  is  a  me- 
soscale  front  of  temperature,  salinity, 
and  hence  sound-speed,  separating  the 


shelf  and  slope  water  masses  (Figure  5a). 
The  frontal  system  is  variable  on  mul¬ 
tiple  temporal  and  spatial  scales.  Atmo¬ 
spheric  forcing,  Gulf  Stream  rings,  river 
inflows,  and  buoyancy-driven  flows,  as 
well  as  tides  and  internal  waves,  affect 
its  dynamics.  The  main  in  situ  data  were 
collected  during  July  26- August  4, 1996, 
mostly  over  an  intensive  acoustic  domain 
(Figure  5b),  as  part  of  the  ONR  Shelf¬ 
break  PRIMER  Experiment  (Lynch  et  al., 
1997).  The  ocean  model  was  substantially 
tuned  to  achieve  useful  physical-acousti¬ 
cal  simulations  (Robinson  and  Lermu- 
siaux,  2004).  ESSE  is  started  on  July  8 
from  a  National  Marine  Fisheries  Service 
survey.  The  central  forecast  on  July  24  is 


illustrated  on  Figures  5a  and  5b  by  hori¬ 
zontal  maps  of  temperature  (T)  at  10  m 
(note  the  large  meanders).  Fields  at  10- 
m  depth  are  strongly  influenced  by  both 
atmospheric  forcing  and  internal  ocean 
dynamics  (Figure  5a),  with  the  former 
imprinting  its  larger  scales  on  the  latter. 
The  zoom  around  the  acoustic  region 
(Figure  5b),  overlaid  with  horizontal  cur¬ 
rent  vectors,  illustrates  that  larger-scale 
ocean  context  (Figure  5a)  is  necessary  to 
understand  the  regional  acoustic  context. 
The  corresponding  error  standard  devia¬ 
tion  maps  (Figures  5c-d)  show  that  at 
10-m  depth,  the  largest  uncertainties  in  a 
16-day  prediction  without  DA  are  domi¬ 
nant  around  the  surfacing  location  of  the 


Figure  4.  (Left)  Direct  Lyapunov  Exponent  (DLE)  field  during  an  upwelling  event  in  Monterey  Bay  (August  26-29,  2003)  derived  from  a  ve¬ 
locity  forecast.  Ridges  in  the  DLE  field  are  highlighted  by  purple  lines  and  represent  Lagrangian  Coherent  Structures  (LCSs),  which  divide 
the  flow  between  regions  of  qualitatively  different  dynamics.  The  DLE  and  LCSs  were  computed  using  MANGEN  (http://www.mangen.info), 
HOPS,  and  ESSE.  (Right)  Relative  error  in  the  DLE  field  computed  from  the  Error  Subspace  Statistical  Estimation  (ESSE)  velocity  ensemble.  In 
relative  terms,  the  LCSs  repel  the  uncertainties  inherited  from  the  ocean  state. 
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front  and  its  meanders,  over  1 5-20  km 
on  each  side  of  the  mean  front.  This  dis¬ 
tance  is  about  twice  the  expected  inter¬ 
nal  Rossby  radius  of  deformation  of  the 
front.  Importantly,  at  depths  closer  to  the 


core  of  the  tilted  front  (30-40  m),  uncer¬ 
tainties  are  larger  and  more  uniform  in 
the  horizontal,  reflecting  the  frontal  tilt 
and  more  turbulent  nature  of  the  internal 
dynamics  of  the  front. 


After  DA  on  July  24,  ESSE  ocean  phys¬ 
ics  uncertainties  were  transferred  on  July 
26  to  acoustical  uncertainties  across  the 
shelfbreak  along  the  main  acoustic  verti¬ 
cal  section  (its  position  is  on  Figure  5). 
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Figure  5.  Simulated  temperature  field  at  10-m  depth  (5a— b)  over  the  MAB  shelfbreak  front  region  and  its  error  standard  deviation  estimate  (5c— d), 
as  computed  by  Harvard  Ocean  Prediction  System  (HOPS)  and  Error  Subspace  Statistical  Estimation  (ESSE).  Panels  (5a,  5c)  cover  the  whole  387  km 
by  360  km  numerical  ocean  domain,  at  3-km  resolution  in  the  horizontal.  Panels  (5b,  5d)  are  zooms  over  the  PRIMER  acoustic  domain  (89  km  by  85 
km),  overlaid  with  horizontal  velocities  uh  on  Panel  5b.  The  ensemble  ESSE  simulation  starts  on  July  8, 1996,  from  historical  and  feature  model  data 
and  an  error  covariance  estimate.  The  plotted  fields  are  16-day  hindcasts  for  July  24.  No  in  situ  data  were  available  in  the  acoustic  region  during  these 
first  16  days.  The  position  of  the  main  acoustic  section,  from  (40.002°N,  71.163°W)  to  (40.368°N,  71.226°W)  is  also  shown,  on  each  panel. 
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Time  is  fixed  and  an  acoustic  broadband 
Transmission  Loss  (TL)  field  is  comput¬ 
ed  for  each  ocean  realization  of  the  ESSE 
ensemble,  using  the  coupled  normal¬ 
mode  model  of  Chiu  et  al.  (1996).  The 
450-Hz  sound  source  is  at  300-m  depth, 
near  the  deepest  point  on  the  slope.  The 
mean  and  standard  deviation  of  the 
coupled  physical-acoustical  fields  along 


the  section  are  shown  in  Figure  6.  The 
mean  sound  speed  held  is  characteristic 
of  the  shelfbreak  front  in  summer  (tilted 
front  and  surface  thermocline).  The  TL 
held  shows  the  sound  attenuation  in 
the  surface  mixed  layer  over  the  shelf 
and  the  funneling  of  sound  in  the  sub¬ 
surface  duct  (colder  shelf  waters).  The 
largest  error  standard  deviations  in  the 


sound  speed  on  July  26  are  in  the  core 
of  the  front  (30-40-m  depth  and  range 
of  2-7  km),  along  the  tilted  front,  and  in 
the  surface  thermocline  on  the  shelf.  The 
largest  error  standard  deviations  in  TL 
are  close  to  the  source,  near  the  foot  of 
the  front  and  on  the  shelf.  At  the  receiver 
vertical  line  array  near  41 -km  range, 
they  vary  from  2  to  3  dB. 
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Figure  6.  Cross-sections  on  July  26, 1996  in  the  mean  and  error  standard  deviation  of  the  sound-speed  (top)  and  broadband  TL  (bottom),  as 
estimated  by  ESSE  along  the  main  acoustic  section  (western  section  of  the  PRIMER  experiment).  Note  that  if  we  also  model  uncertainties  in 
the  bottom  attenuation  coefficient  (not  shown),  we  find  that  the  mean  TL  remains  similar,  but  the  TL  error  standard  deviation  increases  sub¬ 
stantially  in  the  surface  mixed  layer  above  the  shelf.  This  is  a  result  of  the  uncertain  attenuation  at  each  acoustic  bottom  bounce  and  shows  the 
importance  of  seabed  uncertainties  in  acoustic  predictions. 
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Physical-Biogeochemical 
Uncertainties  in  Massachusetts  Bay 

Uncertainty  predictions  are  carried 
out  for  Massachusetts  Bay’s  ecosystem, 
using  HOPS,  ESSE,  and  physical  and 
biological  data  collected  during  sum¬ 
mer  1998  (Beskitepe  et  ah,  2003).  The 
physics  hindcast  is  initialized  for  August 
20, 1998.  Biogeochemical  fields  are  also 
initialized  for  that  period.  The  initial 
physical-biogeochemical  covariance  is 
estimated  in  two  steps.  Vertical  EOFs  of 
profiles  of  temperature  (T),  salinity  (S), 
chlorophyll  a  (Chi),  nitrate  (N03),  and 
ammonium  (NH4)  are  first  multiplied 
with  dominant  eigenmodes  of  horizontal 
correlation  functions,  to  lead  to  a  three- 
dimensional  eigen-decomposition  of  the 
T,  S,  Chi,  N03,  and  NH4  covariance  ma¬ 
trix.  This  “observed”  decomposition  is 
then  used  to  perturb  the  initial  state  and 
estimate  the  “non-observed”  uncertainty 
by  dynamical  model  integration.  ESSE 
is  then  started  from  this  dynamically 
adjusted  error  subspace  on  August  25.  A 
hindcast  of  600  perturbed  runs,  forced 
with  physical  stochastic  noise,  is  then 
carried  out  for  September  2, 1998. 

The  hindcast  is  illustrated  on  Figure 
7  by  uncertainties  of  Chi  at  20-m  depth 
(around  the  sub-surface  Chi  maxima). 
Shown  are  the  mean  Chi  at  20  m,  its  er¬ 
ror  standard  deviation  estimate,  and 
eight  Chi  histograms  (PDF  estimates)  at 
various  locations.  Such  PDF  estimates 
fully  characterize  uncertainty.  Mean  am¬ 
plitudes  (top  right)  are  largest  along  the 
coastline  in  Cape  Cod  Bay,  in  response 
to  wind-driven  upwelling,  and  south  of 
Stellwagen  Bank  (marker  5)  that  is  an  ac¬ 
cumulation  region  where  whales  are  of¬ 
ten  found  in  late  summer.  Uncertainties 
(top  right)  are  largest  in  the  center  and 


mouth  of  the  Bay  and  near  recent  coastal 
upwellings.  Near  Stellwagen  Bank,  maxi¬ 
mum  uncertainties  are  more  at  the  edges 
than  at  the  peaks  of  the  Chi  maxima. 


regional  interdisciplinary  applications. 
The  representation,  attribution  and 
propagation  of  four-dimensional  oceanic 
uncertainties  presents  many  interest¬ 


These  (research)  efforts  include  ocean 

observation  campaigns  dedicated  to 
uncertainty  modeling,  interdisciplinary  data 

assimilation,  ocean  stochastic  modeling, 
new  computational  methods,  adaptive 

modeling,  and  adaptive  sampling  research. 


This  location  is  due  to  uncertainties  in 
the  burgeoning  fall  blooms  and  advec- 
tive  features  (stronger  currents  are  also 
along  these  edges).  Looking  at  the  PDF 
estimates,  PDFs  1,  5,  6,  7,  and  8  are 
steeper  than  a  Gaussian  of  identical  stan¬ 
dard  deviation,  while  PDFs  3  and  4  are 
closer  to  a  Gaussian.  PDF  2  is  a  bit  flat¬ 
ter  than  a  Gaussian  because  it  combines 
two  PDF  peaks  from  nearby  locations 
(south:  lower  Chi  in  a  Gulf  of  Maine  in¬ 
flow;  north:  higher  Chi  in  the  eddy  field 
of  the  coastal  current).  PDFs  5  and  7  are 
skewed  towards  lower  Chi  values  because 
they  are  near  the  low  Chi  jet  exiting  Mas¬ 
sachusetts  Bay  from  the  center  of  Cape 
Cod  Bay.  PDF  8,  east  of  Cape  Cod,  is 
skewed  towards  positive  values  because  it 
is  near  the  high  Chi  content  of  the  Gulf 
of  Maine  coastal  current,  flowing  in  and 
out  of  Massachusetts  Bay. 

CONCLUSIONS 

The  computational  aspects  of  data-driv¬ 
en  modeling  and  prediction  of  uncer¬ 
tainties  were  outlined  and  exemplified  by 


ing  challenges  and  requires  increased 
theoretical  and  applied  research  efforts. 
These  efforts  include  ocean  observa¬ 
tion  campaigns  dedicated  to  uncertainty 
modeling,  interdisciplinary  data  assimi¬ 
lation,  ocean  stochastic  modeling,  new 
computational  methods,  adaptive  mod¬ 
eling,  and  adaptive  sampling  research. 
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Figure  7.  Chlorophyll  a  (Chi)  mean  and  uncertainties  at  20-m  depth  in  the  Mass  Bay  region  on  September  2, 1998,  as  hindcast  by  600  Error  Subspace  Statis¬ 
tical  Estimation  (ESSE)  ensemble  members.  ESSE  was  initialized  on  August  25, 1998.  (Top  left/right)  Mean/Error  Standard  Deviation  of  Chi.  (Bottom)  Eight 
PDF  estimates  (normalized  histograms,  numbered  1  to  8)  corresponding  to  the  eight  marked  locations  on  the  horizontal  maps.  Bars  on  the  histograms  are 
colored  according  to  the  center  Chi  value.  The  minimum,  mean,  standard  deviation,  and  maximum  values  are  given  on  each  histogram  (illustration  by  R.G. 
Hero,  University  of  California,  Santa  Cruz). 
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